The robustness of the Student's t-test is investigated
under the violation of the assumption of equality of
variances. With the aid of computer simulation, Type I
and Type II error rates and the resulting statistical
inference are studied and the effects of unequal variances
on the error rates and the power of the test are determined.
Inferences are determined on the degree of violation of the
equality of variances that still leads to a satisfactory
result when Student's distribution is used.

I. INTRODUCTION


In investigating the robustness of the Student's
t-test it is necessary to initially discuss the underlying
distribution used by the test, the t distribution. Prior
to 1908 statistical analysis was greatly dependent on
knowing the population variance $\sigma^2$ for most procedures.
The random variable

$$ z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} \tag{1-1} $$

was used extensively. To develop z, the hypothesized
population mean $\mu$ is subtracted from the sample mean $\bar{x}$
and the resulting value is multiplied by the square root
of the sample size n and divided by the population standard
deviation $\sigma$. The statistic z has a normal distribution
with mean zero and standard deviation equal to one, N(0,1),
if x is distributed normally with mean equal to $\mu$ and
variance equal to $\sigma^2$, i.e., $N(\mu, \sigma^2)$. When x
has any distribution other than $N(\mu, \sigma^2)$, then z approaches
N(0,1) as $n \to \infty$ according to the central limit theorem.

In 1908 Gosset, publishing under the pseudonym of
"Student", developed a procedure which modified z for
instances where the population variance $\sigma^2$ was unknown.
He estimated $\sigma^2$ using the unbiased estimator

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{1-2} $$





Gosset then considered the random variable

$$ t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} \tag{1-3} $$

As reference (7) notes, the probability distribution of the
random variable t is more complicated than that of z
because both the numerator and denominator of t are random
variables whereas z is simply a linear function of the
random sample $x_1, \ldots, x_n$.

In an effort to obtain the probability distribution
of t, Gosset considered these facts:

1. z has a N(0,1) distribution.

2. $v = \sum_{i=1}^{n} (x_i - \bar{x})^2 / \sigma^2$ has a Chi-square distribution
with d = (n - 1) degrees of freedom.

3. z and v are independent random variables.

He defined

$$ t = \frac{z}{\sqrt{v/d}} \tag{1-4} $$

and found the probability density function (pdf) of t as
given by

$$ h(t) = \frac{\Gamma\left(\frac{d+1}{2}\right)}{\Gamma\left(\frac{d}{2}\right)\sqrt{\pi d}} \left(1 + \frac{t^2}{d}\right)^{-(d+1)/2}, \qquad -\infty < t < \infty \tag{1-5} $$

where $\Gamma$ denotes the Gamma function, with $\Gamma(n+1) = n!$
for n = 0, 1, 2, .... This distribution is known as the Student's
t-distribution with d degrees of freedom.








The pdf h(t) is symmetric with a mean of zero and
resembles the normal distribution. Dixon and Massey (3)
show that even though on the average $s^2$ is equal to $\sigma^2$,
more than half the time $s^2$ is actually less than $\sigma^2$
because of the kurtosis of the distribution of $s^2$.

Lindley (14) has proven through a rigorous mathematical
argument that as the sample size n becomes large the
density of the t distribution tends to that of a N(0,1)
distribution.

Because of its importance, especially as the underlying
distribution for the Student's t-test, the t-distribution
has been tabulated.

In the problem of testing the hypothesis that the
means of two normal populations are equal the most commonly
used test is the Student's t-test. The test as developed
by Gosset formulates the following random variable:

$$ t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}\left(\dfrac{1}{n_x} + \dfrac{1}{n_y}\right)}} \tag{1-6} $$

where $n_x$ and $n_y$ are defined as the sample sizes drawn
respectively from normal populations X and Y.

The variables $\bar{x}$ and $\bar{y}$ are the sample means of the
populations X and Y respectively and $s_x^2$ and $s_y^2$ are the
unbiased sample variances of the X and Y populations
respectively.

The underlying distribution for this statistic is the
same t-distribution as for the statistic shown in (1-3) because
$\bar{x} - \bar{y}$ is a normal random variable and the entire denominator
is a pooling of the sums of the squared deviations from the
means of both samples which provides the best unbiased
estimate of the common population variance.
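
As a rough illustration of the statistic in (1-6), the following Python sketch computes the pooled two-sample t value and applies the accept/reject comparison against a tabulated value. The function name `pooled_t` and the data are invented for this example and are not taken from the study's program; 2.306 is the tabulated two-sided 5% point of the t distribution with 8 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate, as in (1-6)."""
    nx, ny = len(x), len(y)
    # statistics.variance divides by n - 1, i.e. it is the unbiased estimator
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))


# Made-up observations, purely for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 4.0, 5.0, 6.0]

t = pooled_t(x, y)          # both sample variances are 2.5, so t = -1.0
reject = abs(t) > 2.306     # tabulated |t| for 5 + 5 - 2 = 8 degrees of freedom
print(t, reject)
```

With these numbers the observed |t| is below the tabulated value, so the hypothesis of equal means would be accepted.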

To test the hypothesis the absolute value of the
t statistic compiled from the samples is compared to a
particular value from the t distribution which has
associated with it a probability of a more extreme value.
When the observed absolute value of t, |t|, is greater
than the tabulated |t| value, the hypothesis that the two
population means are equal is rejected. However, if the
value of the observed |t| statistic is less than the
tabulated |t| value, the hypothesis is accepted.

In order to use this particular test for equality of
means, as intended, the theory requires certain assumptions
be made. The first assumption dictates that the random
samples drawn from each population must be independent.
Secondly, Gosset stated that the underlying populations
from which the samples are taken must be normally distributed.
The third and seemingly most severe assumption is that the
variances of both populations must be equal.





This paper is concerned with a detailed empirical
study of the ability of the t-test to give correct results
to the question of whether or not the means of two normal
populations are equal when the third assumption of equal
variances is violated. The robustness of the t-test, or
its ability to withstand this violation of assumption, is
investigated for various degrees of violation of the
assumption of equal variances. Under this condition,
certain error rates are investigated. One type of error
rate is the fraction of instances the test implies that
the means of two normal populations are not equal when
in fact they are equal. The second type of error rate
is the fraction of instances that the test implies that
the means of the two normal populations are equal when
in fact they are not equal. The power of the t-test or
its ability to detect the difference between two population
means is a function of the second type of error rate and
is equal to one minus the fraction of errors of the second
kind.

The investigation of these error rates is conducted
for both equal and unequal sample sizes and the ratio
of the population variances is allowed to vary over a
wide range of values.





II. BACKGROUND 


A. STATISTICAL INFERENCE AND HYPOTHESIS TESTING

The evaluation of the robustness and power of a test
requires some elementary knowledge in the area of statistical
inference and especially hypothesis testing. Generally the
observations or random samples drawn from one or more
populations are arithmetically manipulated by a particular
method to obtain information about the underlying populations.
This single number calculated from sample data is referred
to as a statistic. From this statistic certain inferences
can be made about either a particular parameter of a single
population under study or whether equality exists between
the same parameters of two or more populations.

The t-test falls into the second major area of
statistical inference called hypothesis testing. The test
is applied to the common statistical problem of determining
whether or not the means of two normally distributed
populations are equal. The test begins with the hypothesis that
the means are equal and then from the value of the statistic,
the decision is made whether the hypothesis is accepted or
rejected. From the t statistic developed in 1-6 it should
be observed that in testing the hypothesis the direct
concern is not with determining the actual value of the
means of the two distributions but instead in determining
whether a difference exists between the two means.





There are certain basic properties that any method
used for hypothesis testing must be required to possess.
The first property is that when any hypothesis test is used
there should exist only a small probability that the results
obtained from the method lead to an erroneous conclusion.
In other words, in the case of the t-test, if indeed the
means are equal, there should be only a small probability
that when applying the test the statistical inference leads
to the assertion that the means are not equal. The second
requirement states that if a difference does exist between
the two means, there should be a very high probability that
this fact is detected by the test. Sverdrup (26) points
out that in effect these two requirements are competing
with one another, and in choosing any test of hypothesis
both considerations must be balanced against one another.
On one hand there is a strong desire to claim that the two
means are equal when in fact they are equal. However, at
the same time an equally strong desire exists which
concentrates on detecting the smallest possible difference
between the two means in an attempt to assert that the
two means are not equal when they are not equal. If the
first requirement is too strongly adhered to then the
probability of detecting a difference between the means
when it exists is decreased, thereby weakening the second
requirement. Conversely, when the test attempts to detect
extremely small differences between the two population
means, the probability of asserting that the means are
not equal, when in fact they are equal, will increase.

In hypothesis testing a statement whose erroneous
rejection it is particularly desirable to avoid is
called the null hypothesis, and is generally denoted by
$H_0$. In the case of the t-test the null hypothesis is
therefore the statement that the means of the two
populations are equal. If the means are equal it is not
desirable to conclude from statistical inference that they
are not equal. If the means are truly not equal it is not
desirable to conclude that they are after using the test.
This situation is schematically shown in Table 1.


Table 1
ERRORS IN HYPOTHESIS TESTING

                                        TRUE SITUATION
TEST INDICATES            NULL HYPOTHESIS TRUE    NULL HYPOTHESIS FALSE

ACCEPT NULL HYPOTHESIS    NO ERROR                TYPE II ERROR

REJECT NULL HYPOTHESIS    TYPE I ERROR            NO ERROR










A Type I error results when the null hypothesis is
rejected when in fact it is true and a Type II error
results when the null hypothesis is accepted when in fact
it is false. Symbolically the probability of making a
Type I error is denoted by $\alpha$ and the probability of
committing a Type II error is denoted by $\beta$. The
probabilities associated with making a Type I or Type II
error should be as small as possible.

The critical importance in understanding these two
criteria is the fact that they will be the basis of the
evaluation for the t-test during this study. When two
populations meet all three of the assumptions necessary
for use of the t-test, the test results in a certain fraction
of Type I and Type II errors which are unavoidable. This
investigation examines in detail how these fractions
change when the assumption of equal variances is violated.

B. SIGNIFICANCE LEVEL

The tabulated t value mentioned earlier will now be
referred to as the critical t value or $t_{crit}$. The particular
value of $t_{crit}$ is chosen such that a fraction $\alpha$ of the
distributional values of the t distribution lie beyond
$|t_{crit}|$. This is the result of having the null hypothesis
$H_0: \mu_x = \mu_y$ and choosing the alternative hypothesis
$\mu_x \neq \mu_y$. The fraction of the distributional values
lying outside of $|t_{crit}|$ is equal to $\alpha$, the probability
associated with a Type I error.

If the two population means are equal and the t value
resulting from the t-test lies outside of the interval
$(-t_{crit}, t_{crit})$, the test produces a Type I error. This
is due entirely to chance with a probability equal to $\alpha$,
and this Type I error is unavoidable in a fraction $\alpha$ of
the cases run.
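
The link between a tabulated critical value and the tail fraction can be checked numerically. The sketch below integrates the t density of (1-5) with Simpson's rule; the helper names `t_pdf` and `two_sided_tail` are hypothetical, and 2.101 and 2.306 are tabulated two-sided 5% points for 18 and 8 degrees of freedom.

```python
import math


def t_pdf(t, d):
    """Density of Student's t distribution with d degrees of freedom, as in (1-5)."""
    coef = math.gamma((d + 1) / 2) / (math.gamma(d / 2) * math.sqrt(math.pi * d))
    return coef * (1 + t * t / d) ** (-(d + 1) / 2)


def two_sided_tail(t_crit, d, steps=2000):
    """Fraction of the distribution lying outside (-t_crit, t_crit).

    Uses composite Simpson's rule on [-t_crit, t_crit]; steps must be even.
    """
    h = 2 * t_crit / steps
    total = t_pdf(-t_crit, d) + t_pdf(t_crit, d)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-t_crit + i * h, d)
    inside = total * h / 3
    return 1 - inside


alpha_18 = two_sided_tail(2.101, 18)   # close to 0.05
alpha_8 = two_sided_tail(2.306, 8)     # close to 0.05
print(alpha_18, alpha_8)
```

Both tail fractions come out close to 0.05, which is the probability of a Type I error when all assumptions of the test hold.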

The significance level of the test is equal to one
minus the probability of making a Type I error and is
written symbolically as $1-\alpha$.

C. POWER OF A TEST

The probability of committing a Type II error is
denoted by $\beta$. This is the proportion of acceptances of
the null hypothesis when in fact the hypothesis should be
rejected. The power of any test is defined as $1-\beta$. As $\beta$
increases the power decreases and conversely as $\beta$ decreases
the power of the test increases. It results that when two
normal population means are almost equal the power of the
test is small and the power increases as the difference in
the means increases. As the difference between the means
continues to increase the power of the test asymptotically
approaches 1.0. When no difference exists between the
population means then $\beta$ equals $1-\alpha$.
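
The behaviour of the power described above can be sketched with a small Monte Carlo experiment. This is only an illustration under stated assumptions: both populations have unit variance, each sample has ten observations, Python's `random.gauss` stands in for the study's generator, and 2.101 is the tabulated two-sided 5% point for 18 degrees of freedom. The names `pooled_t` and `estimated_power` are invented here.

```python
import math
import random

T_CRIT_18 = 2.101  # tabulated two-sided 5% point, 18 degrees of freedom


def pooled_t(x, y):
    """Pooled two-sample t statistic."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    pooled = (ssx + ssy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled * (1 / nx + 1 / ny))


def estimated_power(delta, n=10, iterations=2000, seed=0):
    """Fraction of rejections when the true means differ by delta."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(iterations):
        x = [rng.gauss(delta, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(pooled_t(x, y)) > T_CRIT_18:
            rejections += 1
    return rejections / iterations


powers = {delta: estimated_power(delta) for delta in (0.0, 1.0, 2.0)}
for delta, power in powers.items():
    print(delta, power)
```

At a difference of zero the rejection fraction is near $\alpha$, so $\beta$ is near $1-\alpha$; as the difference grows, the estimated power rises toward 1.0.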

The power of any statistical test is a function of
several factors. The principal factor influencing the power
is the variance of the respective populations being tested.
The test being evaluated could be influenced by the largest
variance of the two populations, the magnitude of the
difference between the two population variances or the size
of the pooled variance for both populations. A second
factor influencing the power of a test is the size of
the samples taken from both populations and whether or not
these sample sizes are equal. The sample sizes have
an influence on the size of the pooled variance. The
pooled variance (pv) is defined as

$$ pv = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} $$

When the sample sizes are equal the pooled variance is
simply one half of the sum of the variances from both
populations. When the sample sizes are not equal then
the size of the pooled variance is most affected by the
sample having the larger number of observations.

III. VIOLATION OF ASSUMPTIONS


A. PREVIOUS INVESTIGATIONS 

Very few investigations have been carried out to study
the effect of dependent random samples on the Student's
t-test. Scheffé (25) discusses a violation of this
nature and proves that the effect of a serial correlation
on inference about means can be serious and, therefore,
should be considered when using the test. With respect
to the normality assumption it is usually reasonable to
assume normally distributed populations because even
when populations are not normal Scheffé (25) has
demonstrated that the effect of a violation of this nature is
very slight when making inferences about means.

The most interesting and most complex results arise
when the assumption of equal variances is violated.
Circumstances often exist where group to group homogeneity
of variances is not to be expected and is the exception
rather than the rule.

For the particular case where non-homogeneity of
variances is known to exist, different methods have been
proposed as alternatives to the t-test. When the relative
scale factor of the two populations is known appropriate
weighting of the sums of squares gives an exact solution.
In the case where the relative scale factor is unknown
other criteria have been advocated.

Welch (30) has discussed in detail the often employed
alternative statistic (3-1). He demonstrates that when
$\sigma_x^2 \neq \sigma_y^2$ the statistic
developed in 1-6 does not have an underlying t distribution
but that 3-1 results in less bias than the general t
statistic when the variances are not equal.
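
For orientation, the unpooled statistic commonly associated with Welch's work divides the difference in sample means by the square root of $s_x^2/n_x + s_y^2/n_y$. Whether this matches the display in the original text exactly is an assumption; the sketch below uses that standard textbook form next to the pooled statistic of (1-6). The names `pooled_t` and `unpooled_u` and all data are invented for illustration.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Pooled statistic of (1-6)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))


def unpooled_u(x, y):
    """Textbook unpooled statistic (assumed form, see the note above)."""
    nx, ny = len(x), len(y)
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / nx + variance(y) / ny)


# Equal sample sizes: the two statistics coincide algebraically
x_eq = [1.0, 2.0, 3.0, 4.0, 5.0]
y_eq = [2.0, 4.0, 6.0, 8.0, 11.0]
same = abs(pooled_t(x_eq, y_eq) - unpooled_u(x_eq, y_eq)) < 1e-9

# Unequal sample sizes and unequal spread: the two statistics differ
x_big = [float(v) for v in range(10)]
y_small = [4.0, 5.0, 6.0]
gap = abs(pooled_t(x_big, y_small) - unpooled_u(x_big, y_small))
print(same, gap)
```

The two forms only separate when the sample sizes differ, which is also where the previous investigations report the pooled test becoming biased.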

Fisher (5) has proposed another solution to the
problem of testing the hypothesis $\mu_x = \mu_y$ using the
concept of fiducial distributions but the validity of
this approach has been questioned by Bartlett (1).

Each of these alternatives was developed because
the opinion exists that the t-test is not generally
applicable to testing the equivalence of means when the
variances of the two populations may not be equal. This
paper is not concerned with comparing these alternatives
to the t-test. It will attempt to determine the necessity
of using these alternatives. The t-test may prove to be
robust enough to withstand such a high degree of violation
of the assumption of equal variances that these alternatives
are not necessary.

Welch (30) made the first detailed study of the t-test
and its robustness when faced with a violation of the
assumption of equal variances. He concentrated on only
the resulting $\alpha$ level and used an approximation method to
arrive at his results. When the sample sizes were equal,
Welch's conclusion was that the rejection rate arrived
at when the variances are different does not differ
significantly from the specified rate. The approximation
used set the variation of one population to zero and even
under this extreme condition the test never became seriously
biased. In terms of frequencies, Welch has stated that
for equal sample sizes and a difference in population
variances, if the test were performed numerous times the
number of rejections of a true hypothesis would not be
significantly different than the actual number of expected
rejections for the prescribed $\alpha$ level. Using the t-test
as an example, if the test were applied many times to two
normal populations with equal means, the number of Type I
errors expected would be equal to the fraction $\alpha$ of the
total number of iterations of the test. If the two
population variances were in fact different, approximately this
same number of expected Type I errors would result.
Therefore, the violation of the assumption of equal variances
does not bias the test seriously when the sample sizes are
equal. This investigation attempts to verify empirically
the truth of Welch's statements.

Welch also examined the case where the sample sizes
were not equal. Using the same approximation method he
made the following observations. When the larger sample
has the larger variance the difference between the two
means tends to be underestimated. This implies that the
probability of making a Type II error increases, and
consequently the power of the test will decrease. When
the larger sample has a smaller variance the difference
between the two means tends to be overestimated and a
smaller percentage of Type II errors result. The foregoing
results could be summarized to state that the true rejection
rates become significantly different than the specified
rates for unequal sample sizes and unequal population
variances.

Gronow (9) likewise made an exhaustive study of the
rejection rate of the t-test when the assumption of equal
variances is violated. He used a different method of
approximation than Welch, but his study resulted in confirming
what Welch had previously stated. A bias will result in the
rejection rate for populations with unequal variances and
different sample sizes.

In both of these previous investigations, Welch and
Gronow were hampered by the fact that they had to use an
approximation method to arrive at their conclusions.
Consequently, they were forced to look at extreme cases and
draw conclusions. The ratio of variances was set either
at 0, 1 or $\infty$, and then through a mathematical argument they
arrived at a result. This approach leaves many fine points
unanswered. For instance, Welch used equal sample sizes
of ten observations each and made his conclusions concerning
the lack of bias with respect to rejection rates. The
question of what happens with rejection rates for equal but
smaller sizes remains unanswered. Is there a variance
ratio large enough to cause the "true" rejection rate to
differ significantly from the specified rate? For the
same reason the use of extreme cases did not yield enough
information to draw definitive conclusions concerning
the power of the t-test under varying variance ratios.

The rapid development of high speed computers within
the last ten years has been largely responsible for making
detailed studies in this area more feasible. Murphy (19)
used computer simulation to test the actual rejection rates
while comparing the t-test to two alternatives, the Permutation
Test and the Aspin-Welch Test. At a specified $\alpha$ level of
0.05 he substantiated Welch's and Gronow's work concerning
the bias inherent in the test when the sample sizes differ
and population variances are not equal. During his
investigation, Murphy used 500 iterations for each case
studied.

B. AREAS OF INVESTIGATION

These previous investigations into the characteristics
of the t-test aid and encourage further study. The
mathematical results furnished by Welch and Gronow beg for
substantiating data in the form of numerous applications
of the t-test under various degrees of violation of the
assumption of equal variances. This investigation attempts
to provide this needed data while it studies the effect
of unequal variances on the robustness of the test. It
should be restated that robustness of a test is concerned
with the fraction of Type I and Type II errors exhibited
by the test. A study of Type I and Type II error rates
and the power of the test determines the effect of this
violation on robustness.

The rejection rates of the test are studied for
various degrees of unequal variances. The ratio of the
two population variances is termed the scale factor k,
and this scale factor is allowed to range over intervals
determined from the investigation. With equal and unequal
sample sizes an attempt is made to find the particular
value of k, if one exists, where the actual or estimated
"true" rejection rate differs significantly from the
specified $\alpha$ level of 0.05. A second method for finding
a particular k value is used. An accumulation of observations
is made for certain other $\alpha$ levels and combining these
results in the formation of the tail of an empirical
frequency distribution which is compared to the tail of
the theoretical t distribution to determine if the violation
of the assumption of homogeneity of variance causes the
t-test to produce an empirical distribution which differs
significantly from the t distribution. Once again the
attempt is made to find a particular value of k which marks
the point where the empirical frequency distribution no longer
parallels the t distribution.

The investigation attempts to substantiate Welch's
conclusion that for unequal samples the t-test quickly
becomes invalid under the violation of the assumption
of equal variances, or to show that the validity of the
test is only violated at such an extreme scale factor that
in effect the test is valid in most circumstances. A
test is valid if it functions as intended with respect
to the two criteria in hypothesis testing. This means
that the values of $\alpha$ and $\beta$ are the primary measures of
effectiveness for this investigation.

The power of the test is also investigated in the
case of equal and unequal sample sizes. It is desirable
to determine if the power of the test decreases as the
scale factor varies from k=1, and further, if the power does
decrease, is the change due to the violation of the
assumption of equal variances or is the decrease in some way
related to the actual variance present in both samples?

IV. METHODS AND PROCEDURES


A. METHODS

Computer simulation was used to carry out the
investigation. The investigation took the form of
programming numerous "cases" through the computer. Each
case, which was iterated 50,000 times, consisted of the
following elements:

1. Two samples, one drawn from each of two standard
normal populations, X and Y. The sample sizes were
$n_x$ and $n_y$, and ranged in size from five to fifteen
observations each and were not always equal.

2. The scale factor k equal to the ratio of variances,
$\sigma_x^2 / \sigma_y^2$, where k was allowed to vary discretely over a
determined range. The values of the variances from the
two normal standard populations, N(0,1), were adjusted to
achieve the desired scale factor.

3. A difference in means of the two populations
which was allowed to range from zero to five, in 0.5
increments, which resulted in 11 different values.
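
The three elements above define a grid of cases. A minimal sketch of how such a grid might be enumerated follows; the particular sample-size pairs and the range of scale factors are hypothetical placeholders, since the text fixes only the 0.5 spacing of the mean differences and the five-to-fifteen range of sample sizes.

```python
from itertools import product

# Differences in means: zero to five in steps of 0.5 (11 values, as in the text)
mean_differences = [0.5 * i for i in range(11)]

# Hypothetical choices, for illustration only
sample_sizes = [(5, 5), (10, 10), (15, 5)]             # (n_x, n_y) pairs
scale_factors = [1.0 + 0.25 * i for i in range(9)]     # k = 1.0, 1.25, ..., 3.0

# Each case is one combination of sample sizes, scale factor, and mean difference
cases = list(product(sample_sizes, scale_factors, mean_differences))
print(len(mean_differences), len(cases))
```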

For example, a single case would consist of $n_x = 10$
together with particular values of $n_y$, k, and
$\mu_x - \mu_y$. For this case, 50,000
iterations were performed and the following data were
collected. The rejection rates for the critical values
from the t distribution associated with $\alpha$ levels of 0.1,
0.05, 0.02, 0.01, and 0.001 were compiled. At the $\alpha$
level of 0.05, the estimate of the "true" rejection rate
$\hat{\alpha}$ and the estimate of the "true" power of the test, $1-\beta$,
were calculated.

Initially 5,000 iterations were performed for each
case. This was done to arrive at some indication of what
value the scale factor had to obtain to force the test
to produce invalid inferences. When this tentative scale
factor was determined for each pair of sample sizes the
number of iterations was increased to 50,000 and the
scale factor was allowed to vary from one to this tentative
value in increments of 0.25.

Two different criteria were used to determine the
validity of the t-test at various scale factor or k
values. First a study was made of the differences
between the estimated "true" rejection rate resulting
from the 50,000 iterations and the expected rejection
rate at a single $\alpha$ level of 0.05. These two rejection
rates were compared to determine at what k value they
became significantly different. The test used to conduct
this comparison had a significance level of 0.975.
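
A comparison of this kind can be sketched as a two-cell Chi-square calculation on the numbers of rejections and acceptances. The sketch assumes the "significance level of 0.975" refers to the upper 2.5% point of the Chi-square distribution with one degree of freedom, whose tabulated value is 5.024; the function name and the example counts are invented.

```python
CHI2_CRIT_1DF = 5.024  # tabulated upper 2.5% point, one degree of freedom


def rejection_chi_square(rejections, iterations, alpha=0.05):
    """Chi-square statistic comparing observed and expected rejection counts."""
    expected_reject = alpha * iterations
    expected_accept = (1 - alpha) * iterations
    accepted = iterations - rejections
    return ((rejections - expected_reject) ** 2 / expected_reject
            + (accepted - expected_accept) ** 2 / expected_accept)


# With 50,000 iterations the expected number of rejections at alpha = 0.05 is 2,500
close = rejection_chi_square(2600, 50_000)   # 100**2 / 2375, about 4.21
far = rejection_chi_square(2610, 50_000)     # 110**2 / 2375, about 5.09
print(close < CHI2_CRIT_1DF, far > CHI2_CRIT_1DF)
```

Under these assumptions an estimated rejection rate would be judged significantly different from 0.05 once it departs from 2,500 rejections by about 110 in 50,000 iterations, roughly 0.0022 in rate.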

The second method used to determine the "validity"
of the t-test was more stringent than the comparison of
rejection rates at a single $\alpha$ level. The second method
took the rejection rates compiled at the five $\alpha$ levels,
0.1, 0.05, 0.02, 0.01, and 0.001 and from these figures
constructed the tail of an empirical frequency distribution.
This developed empirical distribution was then compared
to the tail of the t distribution to determine at what
k value the two distributions became significantly different.
A Chi-Square Goodness-of-Fit Test with four degrees of
freedom and a significance level of 0.975 was used to
conduct the comparison.
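
One way such a tail comparison could be set up is shown below. The construction of the cells is an assumption: the cumulative rejection counts at the five $\alpha$ levels are differenced into five tail cells, giving four degrees of freedom, and 11.143 is the tabulated upper 2.5% point of the Chi-square distribution with four degrees of freedom. The function names and the counts are hypothetical.

```python
ALPHAS = [0.1, 0.05, 0.02, 0.01, 0.001]
CHI2_CRIT_4DF = 11.143  # tabulated upper 2.5% point, four degrees of freedom


def tail_cells(cumulative):
    """Turn cumulative tail counts (one per alpha level) into disjoint cells."""
    cells = [cumulative[i] - cumulative[i + 1] for i in range(len(cumulative) - 1)]
    cells.append(cumulative[-1])  # the most extreme cell
    return cells


def tail_chi_square(rejections, iterations):
    """Goodness-of-fit statistic for the empirical tail against the t tail."""
    observed = tail_cells(rejections)
    expected = tail_cells([a * iterations for a in ALPHAS])
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical rejection counts out of 50,000 iterations
matching = tail_chi_square([5000, 2500, 1000, 500, 50], 50_000)
inflated = tail_chi_square([5600, 2900, 1250, 650, 80], 50_000)
print(matching, inflated, inflated > CHI2_CRIT_4DF)
```

Because every cell of the tail enters the statistic, a departure that is modest at any single $\alpha$ level can still accumulate into a significant value, which is consistent with this criterion being the more stringent of the two.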

Also during the 50,000 iterations for each case
the estimated "true" rejection rates for Type II errors
were being compiled and converted into a value for the
power of the test. Appropriate cases were combined to
develop power curves for graphic comparisons.

B. PROCEDURES USED

Sample generation was accomplished with a Gaussian
Normal Generation Program on file with the computer center
at the Naval Postgraduate School. The program was
developed by Marsaglia, MacLaren, and Bray (15). The
authors stated that in theory the Gaussian method they
developed is completely accurate in that the procedure
provided returned a random variable with exactly the
normal distribution, and in practice the result is an
approximation influenced only by the capacity (word
length) of the computer used.

The accuracy of the random variables generated was
tested by studying the first four moments, mean, standard
deviation, skewness, and kurtosis on 35 samples of 10,000
numbers each. Each sample generated a distribution with
normal characteristics. A $\chi^2$ Goodness-of-Fit Test with
nine degrees of freedom and a 0.99 significance level
was also used to test the 35 samples. Using this test
the samples were tested against a N(0,1) population and
no significant differences resulted between any of the
samples and this N(0,1) population. These investigations
seemed to give adequate indication that the numbers being
generated were from a N(0,1) population.
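
A moment check of this kind can be sketched as follows. Python's `random.gauss` is used here only as a stand-in for the generator described above, and the helper name `four_moments` is invented; for a N(0,1) population the four values should be near 0, 1, 0, and 3.

```python
import math
import random


def four_moments(values):
    """Return mean, standard deviation, skewness, and kurtosis of a sample."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - m) ** 4 for v in values) / n   # fourth central moment
    return m, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2


rng = random.Random(12345)
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
sample_mean, sample_sd, skewness, kurtosis = four_moments(sample)
print(sample_mean, sample_sd, skewness, kurtosis)
```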

The actual method of obtaining the information called
for in the study consisted of using the FORTRAN Program
included in Appendix A. In the program the sizes of the
two samples were initially established. Sample sizes
ranged from five to fifteen observations and $n_x$ and $n_y$
could be set to any value within the range. Initially both
samples were drawn from a N(0,1) population using the
Gaussian Normal Generation Program. By multiplying each
observation of one sample by a standard deviation value $\sigma$,
and adding a constant, c, to the result, the underlying
population of the sample was transformed into a desired
normal population, $N(c, \sigma^2)$. The two normals, N(0,1) and
$N(c, \sigma^2)$, now had a variance ratio of $1/\sigma^2$ and a difference
in means equal to c. The two samples were then subjected
to the t-test and the resulting t statistic was tabulated
for the appropriate rejection rates. This iteration was
repeated 50,000 times. At the conclusion of the iterations
the value for the difference in means was incremented,
the standard deviation value remained the same, and another
case with 50,000 iterations was performed. When all
values for the differences in means had been exhausted,
a new value for the standard deviation was read into the
program and the entire process repeated. This procedure
was continued until all desired variance ratios were
generated.
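
The procedure just described can be sketched in a few lines of Python. This is not the FORTRAN program of Appendix A: `random.gauss` replaces the Gaussian generator, the iteration count is reduced, the choice of which sample is transformed is an assumption, and the names `pooled_t` and `simulate_case` are invented. The value 2.101 is the tabulated two-sided 5% point for 18 degrees of freedom.

```python
import math
import random


def pooled_t(x, y):
    """Pooled two-sample t statistic."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    pooled = (ssx + ssy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled * (1 / nx + 1 / ny))


def simulate_case(nx, ny, sigma, c, t_crit, iterations, seed=0):
    """Rejection fraction for one case: N(0,1) against N(c, sigma**2)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(iterations):
        x = [rng.gauss(0.0, 1.0) for _ in range(nx)]
        # Draw from N(0,1), then scale by sigma and shift by c
        y = [sigma * rng.gauss(0.0, 1.0) + c for _ in range(ny)]
        if abs(pooled_t(x, y)) > t_crit:
            rejections += 1
    return rejections / iterations


# Equal means (c = 0), so every rejection is a Type I error
equal_case = simulate_case(10, 10, 1.0, 0.0, 2.101, 3000, seed=1)
unequal_case = simulate_case(15, 5, 3.0, 0.0, 2.101, 3000, seed=1)
print(equal_case, unequal_case)
```

In this illustration the second call gives the smaller sample the larger variance, and the rejection fraction rises well above the nominal 0.05.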

Tabulation of the rejection rates consisted of testing
the resulting t statistic against appropriate critical
values. The particular critical values chosen were not
only a function of the desired $\alpha$ level but also the number
of degrees of freedom for the particular case. The degrees
of freedom for any case were equal to the total number
of observations from both samples minus two (i.e.,
$n_x + n_y - 2$). This number of degrees of freedom results
from the fact that there are $n_x - 1$ independent deviations
from the mean in the first sample and $n_y - 1$ in the second
and a total of $n_x + n_y - 2$ independent deviations from the
mean to estimate the populations' variances.

V. RESULTS


A. ESTIMATED "TRUE" REJECTION RATES

1. Equal Sample Sizes

The initial objective in this study was to
investigate what effect a violation of the assumption
of homogeneity of variances would have on the rejection
rate of the t-test, at $\alpha$ = 0.05. At what k value would
the estimated "true" rejection rate differ significantly
from the expected rejection rate?

Initially the cases for equal sample sizes were
studied. Samples of size five, ten, and fifteen were
chosen. It was assumed that information gathered at
these levels would cover the complete spectrum of possible
problems encountered in the use of the t-test. Table 2 below
gives the results of the estimated "true" rejection rates
of the t-test over the range of scale factors, when samples
of equal sizes were used.


Table 2
ESTIMATED "TRUE" REJECTION RATES FOR $\alpha$ = 0.05,
EQUAL SAMPLE SIZES


k 1/9 ya) IS ss il a 5 7 


Sa. mick Oem. 0656920504 .9.0556 ..0494 .0542 .0600 .0662 
0) SN MOT eeeO5 SO 05125. 05400 50440 .0474 .0530 0616 
lk, LS FOSS GOS So mOS S40 0SS20486 .0514 .0536 .9486 
The values given in Table 2 are the fraction of
rejections in 5,000 iterations in each case. With an α
level of 0.05, the expected rejection rate is exactly 0.05.
Even in the cases where all the assumptions are completely
satisfied, the estimated rejection rate can only closely
approximate 0.05 because the number of rejections is a
random variable from a binomial distribution with parameter
p = 0.05. The occurrence of a rare event has positive
probability and therefore small deviations from 0.05
will occur for the estimated rejection rate. It can be seen
that as k deviated from one in both directions, the
estimated "true" rejection rate also increased with respect
to the α level of 0.05. This occurrence was true for each
of the equal sample sizes. As the sample sizes themselves
increased and more information was available to the t-test,
there seemed to be a less rapid growth in the difference
between the "true" and specified rejection rates.
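
As a rough guide to the size of such chance deviations, the standard deviation of an estimated rejection rate can be worked out from the binomial model. This is a standard calculation added here for illustration, not a figure taken from the original study. With N = 5,000 iterations and p = 0.05:

$$\sqrt{\frac{p(1-p)}{N}} = \sqrt{\frac{0.05 \times 0.95}{5000}} \approx 0.0031$$

An estimate such as the 0.0494 reported for k = 1 at sample size five therefore lies well within one standard deviation of 0.05.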

The k values in Table 2 were developed by setting
the variance of the Y population equal to one and then
allowing the variance of the X population to change in
order to effect the desired variance ratio. This meant
that even for equal sample sizes the cases k = 1/9 and
k = 9 were not exactly the same. For both scale factors
the magnitude of the ratio of the two population variances
is the same, but the pooled variance present in case k = 1/9
is 5/9 and in the case k = 9 the pooled variance is 5.
This same type of difference is present in the other
complementary pairs of k values, 1/3 - 3, 1/5 - 5, and 1/7 - 7.
In observing the data, though, there appears to be no
correlation between the size of the pooled variance and a
change in the estimated "true" rejection rate. It was
concluded that the primary cause for a change in the
estimated "true" rejection rate was a change in the scale
factor value.
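
The pooled variances quoted above (5/9 for k = 1/9 and 5 for k = 9) can be checked with a short calculation. The sketch below is illustrative only: the function name `pooled_variance` is hypothetical, and it uses population variances in place of sample variances, weighting each by its degrees of freedom.

```python
from fractions import Fraction


def pooled_variance(var_x, var_y, n_x, n_y):
    """Pooled variance with each variance weighted by its degrees of freedom.

    Assumption: population variances stand in for the sample variances,
    so this is the value the pooled estimate is centred on.
    """
    return ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)


# Variance of Y fixed at one, variance of X set equal to k,
# equal sample sizes of five observations each.
for k in (Fraction(1, 9), Fraction(9)):
    print(k, pooled_variance(k, Fraction(1), 5, 5))
```

For equal sample sizes the result is simply the average of the two variances, so it does not depend on whether the common sample size is 5, 10, or 15.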

The primary objective of the investigation was
to determine those values of k at which the estimated
"true" rejection rate begins to differ significantly
from the specified α level. A Chi Square test with one
degree of freedom and a significance level of 0.975 was
used to determine the fraction, and number, of "true"
rejections that, if achieved by the test, would imply that
the two rates could be considered significantly different.

The χ² statistic was developed from the case shown below.

            NUMBER OF CASES REJECTED    NUMBER ACCEPTED

OBSERVED               A                       B

EXPECTED              250                    4750

The expected number of rejections, 250, comes from the fact
that 5,000 iterations were performed for each case and the
critical t value used produced a specified α level of 0.05.
Five percent of 5,000 is 250, the expected number of
rejections.

Using a 0.975 significance level for the χ² test
meant that if the number of observed rejections, A, became
greater than 319 or less than 181, a significant difference
between the estimated and specified rejection rates would
be implied. Three hundred and nineteen is exactly 6.38
percent of 5,000, and 181 is exactly 3.62 percent of 5,000.

With these critical percentages of .0638 and
.0362 and the data from Table 2, the following observations
can be made. For the sample sizes of five observations
the critical value of k where the estimated "true" rejection
rate becomes significantly different from the specified
rate appears to occur for a k value between five and seven.
For equal sample sizes of either 10 or 15 observations each
the sought after critical k value appeared to lie beyond
7. It was decided to conduct the investigation for
these two equal sample sizes for k values between one and
nine.

The more detailed study was now conducted. For
equal sample sizes of 5, 10, and 15 observations the k
intervals [first interval illegible in source], (1,9), and
(1,9), respectively, were investigated. In each case the
variance ratio was incremented from one to the upper limit
of the interval in 0.25 steps. For each scale factor value
50,000 iterations were performed. With 50,000 iterations
and an α equal to 0.05, the critical number of rejections
became either 2718 or 2282. For any k value producing a
number of rejections greater or less than these two figures,
respectively, the implication would be that the estimated
"true" rejection rate was significantly different from the
expected rejection rate.

At the same time the 50,000 iterations produced
rejection rates for the other specified α levels, 0.10,
0.02, 0.01, and 0.001. With these rates it was possible to
develop an empirical frequency distribution. By comparing
this empirical distribution with the t distribution it
was possible to determine, in a second manner, a critical
k value where the two distributions became significantly
different.

The results of using these two criteria for
testing the validity of the t-test for the various equal
sample sizes under varying k values are contained in Table 3.
The k values listed include all the pertinent information
needed in the investigation.
Table 3
VALIDITY RESULTS FOR THE t-TEST WITH EQUAL SAMPLE SIZES,
50,000 ITERATIONS AT EACH k VALUE


n_x = n_y =        5            10            15

               Criteria      Criteria      Criteria
k               A    B        A    B        A    B
1.00 25 OZ JEN SSS a aN ZAZ OE 
25 2434 A Zs aN 2490 A 
es 50 2589 A EAS eI ZAC A ek 
ie 5 2520 4 25 Ak 7A75° oA 
2.00 2588 A 250 0a ok ILO Soy oN 
Pee? 5 2614 A 25/ ie Kk 24907 A 
250 2679 R 2564 en 25609 R 
2215 TAT Ss a 257) Ok 2557 eX 
oe 00 7819 5 R 25907 oN DSO IN 
Be 25 27 56k Zoo kK 2 S17 ae 
eo 0 ZO Z7 SO) OR ZOo27—° 
Se 5 Z904  R floes ESS 
4.00 2887 R 27 Uiguee kx 27 14) 
a. 25 2946 R 2686 R ZS 30 eek 
a 50 ZOS4 RR 27206 PRS) = IR 
ae 5 SOc OR ZT ZITA 26040  R 
5.00 3179 R IS Se Re 2607 ek 
Sos 27 See 2071, R 
a0 2845 7 R 2095 > 
5.75 26 Sleek 
6.00 2860 R 
GZS Zooor) kk 
6.50 TPIS Mae 
14 9 2683 R 
7. 00 Zi Sa RR 
eas 27 OC lk 
7250 GOSS I AR 
VRIES 2720) 
8.00 288. oR 
Sie o ILS (0 Jee 


A = Estimated "true" number of rejections at a
specified α level of 0.05, critical number
2718 or 2282 (α = 0.025)

B = Outcome of testing H0 that the empirical
distribution equals the t distribution
(α = 0.025)

In each of the cases of equal sample sizes, as the
scale factor k increased, the estimated number of "true"
rejections for an α level of 0.05 also increased. For
equal sample sizes of five observations each, a definite
critical k value between 2.50 and 2.75 was determined where
the estimated "true" rejection rate differed significantly
from the expected rejection rate of 2,500 rejections in
50,000 iterations. For samples of ten observations each
the definitive break is not so evident. At k = 3.50
the two rejection rates are significantly different while
at k = 3.75, 4.00, and 4.25 the rates are not significantly
different. For k values greater than 4.25 the two
rates are consistently significantly different. The
assumption that the result at k = 3.50 is an extreme random
occurrence results in concluding that the estimated "true"
rejection rate begins to differ significantly from the
expected rejection rate at a scale factor k between
4.25 and 4.50. Such a random occurrence is also assumed
to have occurred in the case of 15 observations each
[k value illegible in source]. This particular case yielded
rather inconclusive results and it can only be determined
that the critical k value sought after lies in the k range
from 5.75 to 6.75.

The results of using this less stringent requirement
can be summarized in Table 4.

Table 4
CRITICAL k INTERVALS DETERMINED UNDER THE CRITERIA OF
EQUAL REJECTION RATES


EQUAL SAMPLE SIZES      CRITICAL INTERVAL
n_x = n_y               k*
 5                      2.50-2.75
10                      4.25-4.50
15                      5.75-6.75


In evaluating the robustness of the t-test with
respect to a significant difference between the developed
empirical distribution and the t distribution, the resulting
critical k intervals determined were less in all cases than
the k intervals discussed in the previous paragraph. For
the case n_x = n_y = 5, the k value where the two distributions
became significantly different occurred in the interval
2.25 to 2.50. In the case n_x = n_y = 10, the hypothesis
that the two distributions were equal was accepted up to
a k value between 3.00 and 3.25. A variance ratio greater
than 3.25 produced a rejection of the hypothesis without
exception. In the case n_x = n_y = 15 such an exact k
interval could not be determined. Rejections of the
hypothesis occurred at k equal to 2.50, 3.50, and values
greater than or equal to 4.00. Assuming that this case
is as robust as the case for ten observations in each
sample, the rejection at k = 2.50 could be considered
an extreme random occurrence. Because of the rejection
of the hypothesis at k = 3.50, no concise 0.25 k interval
could be determined. Therefore it was only concluded that
the critical k value sought after must lie in the interval
between k = 3.25 and k = 4.00.

The results of using this more stringent requirement
are summarized in Table 5 below.


Table 5
CRITICAL k INTERVALS DETERMINED UNDER THE CRITERIA OF
EQUAL DISTRIBUTIONS


EQUAL SAMPLE SIZES      CRITICAL INTERVAL
n_x = n_y               k*
 5                      2.25-2.50
10                      3.00-3.25
15                      3.25-4.00


Even for the most stringent criteria and the
smallest equal sample sizes, five observations, the k*
found was between 2.25 and 2.50. This means that the
variances of the two normal distributions under study can
differ in magnitude by a factor greater than two and the
t-test can still give valid answers. Increasing the
observations to 15 in each sample allows the variances
to differ in magnitude by a factor of approximately four,
and the t-test still continues to produce valid inferences.
Reducing the stringency of the criteria for validity
increases the degree of violation of the assumption that
the test can withstand. With respect to estimated "true"
rejection rate and equal sample sizes, this segment of the
investigation indicates that the t-test is extremely robust.

2. Unequal Sample Sizes

Welch (22) had predicted that for unequal sample
sizes a violation of the homogeneity of variance assumption
would result in a strong bias and invalidate the t-test
rapidly. Unequal sample sizes were studied in the same
manner as the equal sample size cases. Initially 5,000
iterations were performed to obtain an indication of what
range of k values needed to be included in a more
detailed study. These initial results are contained in
Table 6.

Table 6
ESTIMATED "TRUE" REJECTION RATES FOR α = 0.05,
UNEQUAL SAMPLE SIZES, σ_y² = 1


                              k
n_x  n_y   1/9   1/7   1/5   1/3    1     3     5     7
ome OO7Iee 0870 2.0748 .0498 .0432 .0378 .03598 
Pome Zo ttIG ss 21056 .0844 .0504 .0290 .0240 .0242 
Peo toozenlooom 21270 .1074 .0462 .0180 .0154 .0140 
PCs oe 7 Ole 654.1126 .0526 .0162 .0114 .0082 
Peer fOoe oe hOZ05 0950 5.0782 .0512 .0330 .0294 .0274 
Peet oO 0708 0046 .0482 .0356 .0364 .0392 
i. 


oso OocG es 07 109 2.0490 .0324 .03560 .0350 


The bias characteristic of the test is evident from
the data of Table 6. Remembering that k, the scale factor,
is defined as σ_x²/σ_y², the table shows that whenever the larger
sample has the larger variance, k = 3, 5, 7, or 9, the
estimated "true" rejection rate is less than the specified
rate. When the sample n_x has the smaller variance, k = 1/3,
1/5, 1/7, or 1/9, the estimated "true" rejection rate is
greater than the specified rate. This observation is true
for all cases and is an actual data confirmation of Welch's
mathematical conclusions.

To explain this result, the formula for the
t statistic must be further examined, where

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}\left(\dfrac{1}{n_x} + \dfrac{1}{n_y}\right)}}$$

Of importance is the first term of the denominator. This
quantity is called the pooled variance and is the critical
element in explaining the results in Table 6. To obtain the
desired scale factor k the variance for the Y population was
maintained at one and the variance for the X population was
allowed to vary to achieve the particular scale factor.
For any of the unequal sample cases in Table 6 with k = 1,
the pooled variance term of the t statistic came out to
a certain average result. Now as k increased from one
through nine the sample variance of the X population, s_x²,
also increased. This caused the pooled variance term to
also increase, and with the remaining term of the denominator
and the numerator remaining relatively constant the average
t statistic decreased. As the t statistic decreased a
greater proportion of the results fell within the critical
interval (-t critical, t critical) and the probability of a
t statistic greater than t critical decreased. The
estimated "true" rejection rate therefore decreased. In
an opposite manner, as s_x² decreased, k = 1 to 1/9, the
average t statistic increased and a greater proportion
of the results fell outside of the critical interval,
causing the estimated "true" rejection rate to increase.

In the pooled term the sample variance s_x² is weighted
by (n_x - 1)/(n_x + n_y - 2). Now for any particular k value,
as n_x increases the change in the estimated "true" rejection
rate is accelerated. As an example, for k = 3, in all the
cases where n_y = 6 the estimated "true" rejection rate is
less than the specified rate. Proceeding down the column,
as n_x increases the difference between the two rates is
increasingly more pronounced. This is due to the increased
weight applied to s_x² as n_x increases.
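
One way to see the direction of this bias is to compare the actual variance of the difference in sample means with the variance that the pooled formula assigns to it. The sketch below is offered as an illustration rather than as the author's derivation: the function name `variance_ratio` is hypothetical, and population variances are substituted for the sample variances. A ratio below one means the denominator of t is too large on average (too few rejections); a ratio above one means it is too small (too many rejections).

```python
def variance_ratio(k, n_x, n_y, var_y=1.0):
    """Ratio of the true variance of (xbar - ybar) to the variance implied
    by the pooled formula.

    Assumption: population variances replace the sample variances, so the
    result describes average behaviour only.
    """
    var_x = k * var_y
    true_var = var_x / n_x + var_y / n_y
    pooled = ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)
    assumed_var = pooled * (1.0 / n_x + 1.0 / n_y)
    return true_var / assumed_var


# Larger sample (n_x = 15) paired with the larger, then the smaller, variance.
for k in (9.0, 1.0, 1.0 / 9.0):
    print(k, round(variance_ratio(k, 15, 6), 3))
```

With equal sample sizes the ratio equals one for every k, which is consistent with the much greater robustness observed in the equal sample size cases.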

This same bias was investigated by developing
the scale factor k by a different method. In this instance
the variance for the X population was set equal to one and
the variance of the Y population was allowed to vary in
order to develop the desired scale factor values. The
same type of bias characteristics were obtained and are
found in Table 7. In a majority of the data points the
bias was slightly more pronounced in each direction when
compared to similar points in Table 6, but they do not
appear to be significantly different. When the bias
caused the estimated "true" rejection rate to be greater
than the specified level, the bias was even greater in
the case where σ_y² > 1. This difference, though slight,
between the two approaches can be explained. In Table 7
the smaller sample, n_y, is drawn from the population
with the changing variance. Statistically, this smaller
sample provides less information about the underlying
population, with the resulting mean standard deviation
being greater than in the case where the sample variance of
the larger sample is varied; thus the bias is more
pronounced.


Table 7
ESTIMATED "TRUE" REJECTION RATES FOR α = 0.05,
UNEQUAL SAMPLE SIZES, σ_x² = 1


                              k
n_x  n_y   1/9   1/7   1/5   1/3    1     3     5     7
8 Cer lesomelecor., 0S04 0/788) .0498 .0446 .0380 .0388 
10 Ome o eee lee Ze LIA 2a So4 0504 .05310 .0258 .0236 
a) CeeeeoUcemeecoocm. | 54001070 .0462 .6212 .0130 .0146 
ICs: riolomeleacor. loaZeni198 ).0526 .0172 .0112 .0078 
0 


Pave OS Om SO 0512 .0312 .0284 .0324 


The explanation of the bias characteristics
discussed for the case resulting in Table 6 also applies
to the method of generating the scale factor in this case.
The same result holds in that the greater the difference
between sample sizes, the more pronounced the bias.


In searching for a critical k value in each of the
unequal sample size cases, the initial 5,000 iteration test
indicated that in every case except for n_x = 8, n_y = 6, the
estimated "true" rejection rate became significantly
different from the expected rejection rate at k values less
than 3.00. Therefore the initial k values tested for
50,000 iterations ranged over the interval from 1/3 to 3.
If any case indicated a critical k value existed outside
of this interval then the range could be increased. From
the results contained in Table 8 it is evident that no
increase in the k range was necessary for any of the cases
studied.


Table 8
VALIDITY RESULTS FOR THE t-TEST WITH UNEQUAL SAMPLE SIZES,
50,000 ITERATIONS AT EACH k VALUE


n_x-n_y     8-6       10-6      13-6      15-6      15-10     15-12
          Criteria  Criteria  Criteria  Criteria  Criteria  Criteria
k          A   B     A   B     A   B     A   B     A   B     A   B
oi Ser om R 4428 Reseda Ra 59Z6 R596 5 eso lee R 
woos) 5516 Rea 2601 R Ses R 5547 Re 3651 Re osc R 
moo 5402 R 3949 R 4870 Ro S278 R 3660 Rez ied R 
meas 5250 R507 2 R 4434 R 4946 R 3455 Rao 00S R 
eo0O S110 R 36/72 R 4076 R 4460 R23 550 R 3004 R 
ode 2997 Reo 2 Re 3674 Ree 116 RS 6d R 2789 R 
oO0) 26 1 1 R>3060 R 3477 R 3591 R 2992 R 2825 R 
Aoug 927 26 RK 2766 R 2958 R 3034 R 2785 R 2633 R 
O80 Asi0ak hea 24 A 2418 he 2461 ees OZ fe 2440 A 
nee 1 i R 2196 Re 2OSZ RezcolsG Ree 50'Z R 2308 R 
o00> 32168 RoeZ ozo Relea Z Rey 22 R 2108 Re 23521 R 
e750 2173 R LS7z2 R 1504 R 1448 R 1958 R 2160 R 
S000) 2105 Re1792 R436 R1214 ReakSie-5 R 2088 R 


Table 8 (continued)


n_x-n_y     8-6       10-6      13-6      15-6      15-10     15-12
          Criteria  Criteria  Criteria  Criteria  Criteria  Criteria
k          A   B     A   B     A   B     A   B     A   B     A   B
Oeics) 20357 R 1644 R 1289 ReelelG Ss R 1769 R 2094
250 2016 R 1563 Rea Reo R 1681 R 2146
Pe75 1995 Res 5 R 1083 Reno o2 2 Reels SL Rezuis
Soo 2008 R 1490 Ros Z R 854 1eZ2 R 2069


A = Estimated "true" number of
rejections at a single α level
of 0.05, critical number
2718 or 2282 (α = 0.025)

B = Outcome of testing H0 that
the empirical distribution
equals the t distribution
(α = 0.025)


Using the criteria for testing the validity of
the t-test for different k values, the results indicated
that for unequal sample sizes the robustness of the t-test
is poor. For every case the slight increase in k to a
value of 1.25 caused a violation of the criterion that the
developed empirical distribution and the t distribution
not be significantly different. The less restrictive
criterion that the estimated and expected rejection rates
be equal was violated at k values very close to one. Only
in the case n_x = 15, n_y = 12 could a k value in the range
[range illegible in source] be tolerated by the test.

These results demonstrate rather emphatically
Welch's predictions that for unequal sample sizes a
violation of the homogeneity of variance assumption
would result in a strong bias and invalidate the t-test
rapidly. The t-test was not able to withstand a violation
of the assumption to any degree, and the robustness of the
test in this instance must be considered extremely poor.
B. POWER 

The power of the t-test was investigated in a similar
manner to the Type I error rate. Cases were studied for
both equal and unequal sample sizes and various degrees
of violation of the assumption of equal variances. The
fraction (β) of cases accepting the null hypothesis when
in fact it should be rejected because the population
means are not equal was used to develop the power of the
t-test, 1-β, and conclusions were made through comparisons
of graphic results. In all cases an α level of 0.05 was
used.

The primary question asked in the investigation was
what effect did a violation of the equal variance assumption
have on the power of the test? Was a change in the
power directly related to the degree of the violation, or
did there exist a more important factor in determining
the power of the test? As discussed in Chapter 2 the
power of any test is influenced by a combination of
[illegible in source], variances, and sample sizes.

1. Equal Sample Sizes


The results illustrated in Graph 1 are for equal
sample sizes, 15 observations each, and are typical of each
of the other equal sample size cases of five and ten
observations. Data gathered for each of these cases are
contained in Table 9. Graph 1 indicates that as k increased
in value from one to nine the power of the test decreased.
This was a predictable result because of the increased
variance present in the X population. Also shown, though,
in Graph 1 is the result that as k decreased from 1 to
1/9 the power of the test increased. To explain this
result it should be remembered that the desired k values
were achieved by maintaining σ_y² constant and equal to 1
and programming σ_x² to equal specific values. This means
that as k increased from 1/9 to 9 the pooled variance
(2-1) also increased, and as can be seen the power of
the test decreased. In the range from k = 1/9 to k = 1
there was a relatively small decrease in the power, but
this is explained by the fact that the variance of the X
population had to increase in relatively small increments
to achieve the desired k values. Therefore in this range
the size of the pooled variance increased only slightly.
Power decreased appreciably in the range k = 1
to k = 9 because of the relatively large increases in the
variance of the X population. The pooled variance also
shows a relatively large increase over this same
range of k values.

The conclusion made from these observations is
that a violation of the assumption of equal variances does
not directly influence the power of the t-test. There is
a significant difference in the power for k=1/9 and k=9
even though the degree of violation is the same in both
cases. The power of the test is directly a function of
the size of the pooled variance, and the less the amount
of pooled variance the greater the power of the test.

To emphasize the contention that the size of the
pooled variance is the primary factor influencing the
power of the t-test, Graph 2 is provided. Two sets of
curves are plotted. There are two curves with scale
factors equal to 3 and 7, and they are compared to two
curves (K) where the scale factor is equal to one and
therefore no violation of the assumption exists but
where both population variances are equal to 3 or 7.
For k=3 the pooled variance is equal to 2. For K=1,
σ_y²=σ_x²=3, the pooled variance is equal to 3. The power
of the k=3 curve is greater than for K=1 and the variances
equal to 3, but this same curve (K=1) exhibits more power
than the curve k=7, which has a pooled variance equal to 4.
This again indicates that the degree of violation of the
assumption has little to do with determining the power of
the test and that the pooled variance is the critical
element in this determination.

In all of the equal sample size cases the larger
the sample size the greater the power of the test for an
equal value of the pooled variance. This is a well
documented result.

2. Unequal Sample Sizes

Welch (22) has written that a strong bias exists
in the t-test when the assumption of equal variance is
violated and the samples are not equal. This bias has
been shown in the results of the estimated "true"
rejection rates above. This same bias carries over to
the power of the t-test under the same circumstances.

Graph 3 shows the power curve which results for
various k values, for unequal samples of size fifteen and
six. The k values were achieved by maintaining the
variance of Y equal to one and allowing the variance of
X to range from 1/9 to 9. As in the case of equal sample
sizes the power of the test is a function of the size of
the pooled variance.

It should be noted that in the range of k from
1/9 to 1/3 the power of the test is extremely high but
is achieved at the expense of an increase in the fraction
of Type I errors when the two population means are equal.
Here exists a good example of the conflict that develops
when the fraction of Type II errors is decreased to the
point where the rate of Type I errors becomes unacceptable.
For k in the range 3 to 9 the power decreased with an
increase in the fraction of Type II errors, and as a
consequence the Type I errors decreased to a point where
the rejection rate becomes significantly different from
the expected rate. Similar results were obtained for
the other unequal samples tested.

Also included in Graph 3 is a plot of the power
curve for equal sample sizes n_x = n_y = 15, and k=1. In
comparing this curve to the similar k=1 curve for n_x=15,
n_y=6, it can be seen that the power decreased because of
the loss of information due to the fewer observations
obtained for the Y population.

Graph 4 shows two cases where the total number
of observations from both populations is about equal,
but the difference between the sample sizes is not equal.
In one case the total number of observations is 19 with
n_x=11 and n_y=8, the difference between sample sizes being
three. In the second case the total number of observations
is 21 with n_x=15 and n_y=6 and, therefore, the difference
between sample sizes is 9.

For k=1 both cases have equal pooled variances
and the power curves are almost identical. For k=1/7 the
case n_x=15, n_y=6 has a smaller pooled variance than the
case n_x=11, n_y=8 and as a result has a slightly higher
power curve. For k=5 the relative size of the pooled
variances is reversed and as a consequence the power curves
are also reversed.

Next, three different cases are compared.
For k=1 each of the cases has a pooled variance equal to
one but the power curves are not identical because the
total number of observations in each case is not equal.
As the number of observations decreases, the power also
decreases.

As the degree of violation of the assumption
was increased to k=5 the pooled variance in each case
is no longer equal. For n_x=15, n_y=6, the pooled variance
is 3.95; for n_x=15, n_y=10 the pooled variance is 3.44; and
for n_x=15, n_y=[illegible in source] the pooled variance is
[illegible in source]. At k=5 the
relative relationship of the three power curves has
changed somewhat from the case k=1. Under a changing
degree of violation of the assumption a larger number
of total observations causes a less rapid growth in the
size of the pooled variance. This in turn results in a
less rapid deterioration of the power of the test with
an increasing degree of violation.

In all cases, the power changed as a function of
the size of the pooled variance. The same conclusion as
was made in the case of equal sample sizes can be made
here: that the power of the test is a function of the
pooled variance rather than a function of the violation
of the assumption of equality of variances. For unequal
sample sizes, though, the violation of the assumption causes
a marked bias, and this is reflected in the power curves
by either an increase or decrease in the α region of the
curve at the point where the population means are equal.

Table 10
RESULTS FOR THE POWER OF THE t-TEST FOR
UNEQUAL SAMPLE SIZES AND VARIOUS k VALUES

VI. SUMMARY AND CONCLUSIONS


This paper has investigated the robustness of the
Student's t-test under violation of the assumption of the
homogeneity of variances. The estimated "true" rejection
rate and the estimated power of the test have been studied
for the cases of equal and unequal sample sizes. Extensive
use of computer simulation was made to conduct the study
for each area of interest.

It was observed that the determination of the point
at which the estimated "true" rejection rate became
significantly different from the specified rate was
dependent upon the criteria used. Two different criteria
were established:

A. The total number of rejections at a single α level
of 0.05.

B. The k value where the empirically generated
distribution became significantly different from the tail
of the t distribution.

It was also observed that the criteria became more
restrictive and difficult to satisfy from A to B. Consequently,
for each case, the critical k intervals decreased when criteria
B was applied instead of criteria A.







Table 11
ROBUSTNESS OF t-TEST WITH RESPECT TO SCALE
FACTOR VALUES, EQUAL SAMPLE SIZES


n_x = n_y      Criteria A      Criteria B

 5             2.50-2.75       2.25-2.50
10             4.25-4.50       3.00-3.25
15             5.75-6.75       3.25-4.00


Concerning the estimated "true" rejection rates for
"large" equal sample sizes of close to 15 observations
each, it can be seen that even under the most stringent
criteria, the ratio of the two population variances can be
between 3.25 and 4.00 and the t-test will still provide
an accurate statistical inference. Even at the small but
equal sample sizes of five observations each, the magnitude
of the variance ratio is great enough to imply that the
t-test is fairly robust with respect to Type I rejection
rates when the assumption of equality of variances is
violated.

The test loses its robustness dramatically when sample
sizes are unequal and a violation of equal variance occurs.
Welch's predictions have been verified by data generated
by simulation. When the larger sample has the larger
variance the difference between the two means tends to be
underestimated and the estimated "true" rejection rate
falls below the specified level. When the larger sample
has the smaller variance the difference between the two
means tends to be overestimated and the estimated "true"
rejection rate will be greater than the specified level.

With respect to power, the simulation has shown that
the power of the test is a function of the pooled variance
of the two populations and that it is not directly related
to the degree of the violation of the assumption. This
conclusion is valid for both equal and unequal sample
sizes.







APPENDIX A 


The following is a detailed description of the FORTRAN
program used in this investigation. A sample program for
a single case is contained on page 63.

The first term, IDUMMY = 0, is the beginning seed
needed to activate the normal random number generator.
The investigator must then enter the desired sample sizes
for NX and NY. A quantity for the code name VAR is next
read into the program. VAR is the standard deviation that
is to be applied to one of the two samples to effect the
desired variance ratio. The VAR value is printed on the
computer output at this point in the program.

The value for the variable name DMEAN is next read
into the computer. This value establishes the desired
difference in population means used in studying the power
of the test. In those instances when the estimated "true"
rejection rate was investigated with the population means
equal, DMEAN was set equal to 10. A DO loop is then
entered and within each cycle of the DO loop, the variables
NUMACC, LPER10, LPER05, LPER02, LPER01, and LPER00,
which tabulate the empirical frequency distribution, were
set equal to zero. In studying the power of the test, the
DO loop incremented the difference in the population means
by a factor of 0.5 for each cycle.

The program next enters the actual iteration DO loop
which causes 50,000 different pairs of samples to be tested 
by the t-test. NX observations, drawn from a N(0,1) 
population, make up the sample representing the X population. 
Each of these observations is multiplied by the value VAR 
and then the value 10 is added. This causes the sample to 
appear to have been drawn from a N(10,VAR) population. 

NY observations are then drawn from the same initial
N(0,1) population and make up the sample representing the
Y population. To these observational values the value
represented by the variable DMEAN is added. This causes
the NY sample to appear to have been drawn from a N(DMEAN,1)
population.
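
The two transformations just described can be sketched in Python. This is an illustration only, not the thesis program: `random.gauss` stands in for the GRN normal generator, and the function name `draw_samples` and the seed are assumptions introduced here.

```python
import random

def draw_samples(nx, ny, var, dmean, rng):
    """Draw one X sample and one Y sample as described in the text.

    var is used as a standard deviation (a multiplier on N(0,1) draws).
    """
    # X: scale each standard normal draw by var, then add 10.
    x = [rng.gauss(0.0, 1.0) * var + 10.0 for _ in range(nx)]
    # Y: add dmean to each standard normal draw; the spread stays 1.
    y = [rng.gauss(0.0, 1.0) + dmean for _ in range(ny)]
    return x, y

# Hypothetical usage: NX = 15, NY = 6, VAR = 2, equal population means.
rng = random.Random(12345)  # seed chosen for illustration only
x, y = draw_samples(15, 6, 2.0, 10.0, rng)
```

Because only the X sample is scaled, the ratio of the two population variances in this sketch is VAR squared to one.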

With these two samples, the t-test is then used to test
the hypothesis that the two population means are equal. The
resulting absolute value of the observed t statistic is
set equal to the variable name ATOBS. ATOBS is then
compared against appropriate critical values of the t
distribution. These appropriate critical values are
functions of the desired α level, 0.10, 0.05, 0.02, 0.01,
and 0.001, and the number of degrees of freedom for the
samples being tested, NX + NY - 2. When the ATOBS value
is greater than a particular critical value, a rejection
of the null hypothesis occurs at that α level and the
corresponding variable name, associated with the particular
α level of the empirical frequency distribution, is
incremented by one.

The number of the 50,000 iterations in which the t-test
concludes in accepting the null hypothesis is tabulated
by the variable name NUMACC. This is done for an α level
of 0.05. From NUMACC the fraction of Type II errors is
calculated and also the power of the test.
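
The test and the tallying described above can be sketched as follows. This is a hedged Python illustration, not the thesis code: the names `pooled_t`, `tally`, and `CRITICAL_19DF` are assumptions, and the critical values are the standard two-sided table values for 19 degrees of freedom, so they apply only to the NX = 15, NY = 6 case.

```python
import math

# Two-sided critical values of Student's t for 19 degrees of freedom
# (NX = 15, NY = 6), from a standard t table.
CRITICAL_19DF = {0.10: 1.729, 0.05: 2.093, 0.02: 2.539, 0.01: 2.861, 0.001: 3.883}

def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    xbar = sum(x) / nx
    ybar = sum(y) / ny
    ss_x = sum((v - xbar) ** 2 for v in x)  # squared deviations about xbar
    ss_y = sum((v - ybar) ** 2 for v in y)  # squared deviations about ybar
    pooled_sd = math.sqrt((ss_x + ss_y) / (nx + ny - 2))
    return (xbar - ybar) / (pooled_sd * math.sqrt(1.0 / nx + 1.0 / ny))

def tally(atobs, rejections):
    """Add one to the count for every alpha level at which atobs rejects.

    Returns True when the null hypothesis is accepted at alpha = 0.05.
    """
    for alpha, critical in CRITICAL_19DF.items():
        if atobs >= critical:
            rejections[alpha] += 1
    return atobs < CRITICAL_19DF[0.05]

# Hypothetical check of the counting logic with a made-up |t| of 2.5.
rejections = dict.fromkeys(CRITICAL_19DF, 0)
accepted = tally(2.5, rejections)
```

In a full simulation, `tally(abs(pooled_t(x, y)), rejections)` would be called once per pair of samples, and the number of `True` results would play the role of NUMACC.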

At the conclusion of the 50,000 iterations for each case,
NUMACC, β and the power of the test are printed out. Also
the values of the empirical frequency distribution which
have been developed from the test results are printed.

FORTRAN IV G LEVEL 18          MAIN          DATE = 70126

      DIMENSION X(15), Y(15)
      IDUMMY = 0
      NX = 15
      NY = 6
C     A STATEMENT ASSIGNING THE VALUE 19 FOLLOWS HERE IN THE ORIGINAL
C     LISTING.  ITS VARIABLE NAME IS ILLEGIBLE.
      READ (5,5) VAR
    5 FORMAT (F6.4)
      WRITE (6,21) VAR
   21 FORMAT (//15X,F10.4//)
      DMEAN = 4.5
      DO 100 M = 1,1
      DMEAN = DMEAN + .5
      NUMACC = 0
      LPER10 = 0
      LPER05 = 0
      LPER02 = 0
      LPER01 = 0
      LPER00 = 0
      DO 50 I = 1,50000
      DO 10 J = 1,NX
   10 X(J) = GRN(IDUMMY)*VAR + 10
      DO 20 K = 1,NY
      Y(K) = GRN(IDUMMY)
   20 Y(K) = Y(K) + DMEAN
C     THE STATEMENTS THAT SET XBAR, YBAR, POOLX AND POOLY ARE ONLY
C     PARTLY LEGIBLE IN THE ORIGINAL LISTING.  FROM THEIR USE IN
C     TLOW1, POOLX AND POOLY ARE THE SUMS OF SQUARED DEVIATIONS
C     ABOUT XBAR AND YBAR.  THE LEGIBLE FRAGMENTS ARE
C     POOLX = ... XBAR))**2
C     POOLY = ... NY,YBAR))**2
      TLOW1 = SQRT((POOLX+POOLY)/(NX+NY-2))
      TLOW2 = SQRT((1.0/NX)+(1.0/NY))
      TOBS = (XBAR-YBAR)/(TLOW1*TLOW2)
      ATOBS = ABS(TOBS)
      IF (ATOBS .LT. 1.729) GO TO 30
      LPER10 = LPER10 + 1
      IF (ATOBS .LT. 2.093) GO TO 30
      LPER05 = LPER05 + 1
      IF (ATOBS .LT. 2.539) GO TO 50
      LPER02 = LPER02 + 1
      IF (ATOBS .LT. 2.861) GO TO 50
      LPER01 = LPER01 + 1
      IF (ATOBS .LT. 3.883) GO TO 50
      LPER00 = LPER00 + 1
      GO TO 50
   30 NUMACC = NUMACC + 1
   50 CONTINUE
      BETA = NUMACC/50000.0
      POWER = 1.0-BETA
      WRITE (6,60) NUMACC,BETA,POWER
   60 FORMAT (I10,10X,2F14.6)
      WRITE (6,61) LPER10,LPER05,LPER02,LPER01,LPER00
   61 FORMAT (5I10/)
  100 CONTINUE
      STOP
      END
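
The closing arithmetic of the listing, where BETA and POWER are formed from NUMACC, can be sketched in Python as below. The function name `summarize` is an assumption, and the conversion of the rejection counts into rates is an added convenience: the listing itself prints the raw counts.

```python
def summarize(numacc, rejections, iterations=50_000):
    """Convert raw counts into proportions of the iterations run."""
    beta = numacc / iterations   # fraction of acceptances at alpha = 0.05
    power = 1.0 - beta           # fraction of rejections at alpha = 0.05
    rates = {alpha: count / iterations for alpha, count in rejections.items()}
    return beta, power, rates

# Hypothetical counts, invented only to show the arithmetic.
beta, power, rates = summarize(12_500, {0.10: 40_000, 0.05: 37_500})
```

With these made-up counts, `beta` is 0.25 and `power` is 0.75, which equals `rates[0.05]` because every iteration either adds to NUMACC or adds to the 0.05 rejection count. When the two population means are equal, the quantity labeled power is the empirical Type I rejection rate rather than a power.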









INITIAL DISTRIBUTION LIST


Defense Documentation Center
Cameron Station
Alexandria, Virginia 22314


Library, Code 0212
Naval Postgraduate School
Monterey, California 93940


Asst Professor G. Tuck,
Code Tk (thesis advisor)
Department of Operations Analysis
Naval Postgraduate School
Monterey, California 93940


CAPT Harry A. Hadd Jr., USMC
Holmes Road
Marine, Minnesota 55047


Department of Operations Analysis, Code 55
Naval Postgraduate School
Monterey, California 93940





UNCLASSIFIED
Security Classification


DOCUMENT CONTROL DATA - R & D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)


1. ORIGINATING ACTIVITY (Corporate author)
   Naval Postgraduate School
   Monterey, California 93940

2a. REPORT SECURITY CLASSIFICATION
   Unclassified

2b. GROUP

3. REPORT TITLE
   Investigation of the Robustness of the Student's t-Test
   Under the Violation of the Assumption of Equality of Variances

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)
   Master's Thesis; December 1970

5. AUTHOR(S) (First name, middle initial, last name)
   Harry A. Hadd, Jr.

6. REPORT DATE: December 1970
7a. TOTAL NO. OF PAGES: 69
7b. NO. OF REFS: 30

8a. CONTRACT OR GRANT NO.
8b. PROJECT NO.
9a. ORIGINATOR'S REPORT NUMBER(S)
9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)

10. DISTRIBUTION STATEMENT
   This document has been approved for public release and
   sale; its distribution is unlimited.

11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY
   Naval Postgraduate School
   Monterey, California 93940

13. ABSTRACT
   The robustness of the Student's t-test is investigated
   under the violation of the assumption of equality of
   variances. With the aid of computer simulation, Type I
   and Type II error rates and the resulting statistical
   inference are studied and the effects of unequal variances
   on rejection rates and the power of the test are determined.
   Limits are determined on the degree of violation of the
   equality of variances that still leads to a satisfactory
   result when Student's distribution is used.


DD FORM 1473 (PAGE 1)                              UNCLASSIFIED
S/N 0101-807-6811          Security Classification      A-31408





UNCLASSIFIED
Security Classification


KEY WORDS

Student's t-Test
Robustness
Rejection Rates


DD FORM 1473 (BACK)                                UNCLASSIFIED
S/N 0101-807-6821          Security Classification      A-31409













